Generalize 6D canonical integrator to arbitrary curvilinear coordinates (VMEC) by krystophny · Pull Request #408 · itpplasma/SIMPLE

krystophny · 2026-06-19T17:21:05Z

Risk tier

T3: physics, output behavior, coordinate convention

Correctness contract

Intended behavior change

Add orbit_cpp_canonical: a curvilinear canonical-midpoint 6D integrator on the
analytic tokamak, the SIMPLE port of the three Egger-Feiel thesis
discrete-variational schemes. It supersedes the Cartesian orbit_cpp_pauli
discretization (wrong scheme: flat-metric Pauli, not the thesis curvilinear
canonical midpoint).

Three models, integer-dispatched (GPU-portable, !$acc routine seq, fixed-size
6 state, analytic Jacobian, no class()/proc-ptr in the hot path):
MODEL_CP (full charged particle, dt=1), MODEL_CPP_SYM (Pauli symplectic
midpoint, H+mu|B|, dt=80), MODEL_CPP_VAR (Pauli variational midpoint, discrete
Euler-Lagrange, dt=800). The 6x6 Newton system reuses rk_solve (device LU)
from orbit_rk_core.

field_can_test gains eval_field_correct_test, the exact-curl analytic field
(B^k = eps^ijk A_j,i/sqrtg, |B| = sqrt(g_ij B^i B^j)) the 6D models need.
A, dA, d2A match eval_field_test; B, dB, h differ.

Behavior that must not change

The guiding-center integrator and orbit_symplectic are untouched. The GC
symplectic tests (test_sympl*, test_orbit_symplectic_base) and the existing
CPP/full-orbit tests pass unchanged.

Coordinate / unit conventions

Canonical curvilinear (r, theta, phi) with covariant p. Thesis normalization
e = m = c = 1 (the module uses a local c=1, not the CGS speed of light in
util, which would null the magnetic coupling). Metric
g = diag(1, r^2, (R0 + r cos theta)^2).

Numerical invariants

CPP-sym/var conserve energy with a bounded, dt-independent oscillation (no
secular drift) and conserve the canonical toroidal momentum p_phi exactly
(axisymmetric field). mu is a fixed parameter for the CPP models.

Tests added

unit: test_cpp_canonical (per-step oracle reproduction, energy bound,
dt-independent plateau, p_phi invariant on the GC banana, analytic==FD
Jacobian for all three models)

Golden-record impact

unchanged

New self-contained models on the analytic tokamak; no VMEC production path or
golden record is touched.

Failure modes considered

The python reference metric drops the factor r in d g_33/d theta. Using it
breaks the symplectic energy bound. The Fortran uses the correct analytic
derivative; the failing-before/passing-after contrast is in Verification. The
python field d|B|/d theta also omits a chain-rule term; the residual keeps the
python form to reproduce the oracle bit-for-bit, and the O(mu) Jacobian term
differentiates that same dBmod by finite difference for consistency.

Manual validation

Oracle regenerated headless from field_correct_test.field with
scipy.optimize.root(hybr, tol=1e-12), numpy 2.4.6 / scipy 1.17.1, corrected
metric. The Fortran reproduces every reference number.

Verification

Build: make CONFIG=Fast (gfortran, Fast profile).

Failing-before (buggy python metric: d g_33/d theta = -2 (R0+r cos th) sin th,
factor r dropped), injected into the same kernel:

CP max|dE/E0| (1000 steps) =   3.9519E-01
FAIL  CP step1  maxerr=  1.04E-06 tol=  1.00E-10
FAIL  CPPsym step10  maxerr=  2.75E-02 tol=  1.00E-10

Passing-after (correct metric d g_33/d theta = -2 r (R0+r cos th) sin th):

PASS  CP step1  maxerr=  2.22E-16
PASS  CP step5  maxerr=  2.22E-16
  CP max|dE/E0| (1000 steps) =   1.2450E-02
PASS  CPPsym step1  maxerr=  9.15E-17
PASS  CPPsym step10  maxerr=  5.55E-15
  CPPsym max|dE/E0| =   9.9928E-04  end-drift =  -1.2791E-05
PASS  CPPsym energy bound ~1e-3
PASS  CPPvar step2000  maxerr=  2.83E-08
  CPPsym plateau dt=80  max|dE/E0|=9.9928E-04
  CPPsym plateau dt=40  max|dE/E0|=1.0004E-03
  CPPsym plateau dt=20  max|dE/E0|=1.0008E-03
  CPPsym plateau dt=10  max|dE/E0|=1.0009E-03
PASS  CPPsym plateau dt-independent (no secular growth)
  banana r band = 3.0977E-02  p_phi drift = 0.0000E+00
PASS  banana p_phi invariant (<1e-12)
PASS  CP-jac analytic==FD       (max|Jan-Jfd| = 2.04E-10)
PASS  CPPsym-jac analytic==FD   (max|Jan-Jfd| = 1.60E-08)
PASS  CPPvar-jac analytic==FD   (max|Jan-Jfd| = 5.92E-11)
 ALL CANONICAL 6D PORT TESTS PASSED

100% tests passed, 0 tests failed out of 14

Full unit suite: 100% tests passed, 0 tests failed out of 24.

Update: arbitrary curvilinear coordinates on real VMEC

The diagonal analytic-tokamak metric is replaced by a full (non-diagonal) metric
path. orbit_cpp_canonical now carries g_ij, g^ij, dg_ij,k, covariant
A_i + gradient, |B| + gradient, h_i in a block_t and runs the general
Hamiltonian
q_dot^k = g^kj v_j/m, p_dot_k = qc A_j,k v^j + (m/2) g_ij,k v^i v^j - mu |B|,k.
The diagonal toroidal metric is the special case (off-diagonals zero), so the
analytic oracle is still reproduced bit-for-bit.

Two coordinate blocks, integer-dispatched:

COORD_TOK: analytic toroidal metric + exact-curl field, inline,
!$acc routine seq, class-free, analytic Jacobian. GPU-portable.
COORD_VMEC (orbit_cpp_vmec_metric): full metric g_ij/g^ij and
Christoffel symbols from libneo (issue Add Python venv setup script and fix pyproject.toml #322, feature/metric-christoffel),
with metric derivatives from compatibility
dg_ij,k = g_il Gamma^l_jk + g_jl Gamma^l_ik; covariant A_i and |B| from
SIMPLE's native VMEC field. Host-side (libneo class() + 3D splines); Jacobian
by FD of the same residual.

The fetched libneo is pinned to feature/metric-christoffel by default until #322
merges (override with -DLIBNEO_REF=...). SIMPLE's own
cartesian_coordinate_system_t gains the christoffel binding the new libneo
base interface requires (flat: all symbols zero).

Honest limitations

The COORD_VMEC residual/Jacobian are host-side, not GPU-offloadable: libneo's
metric is class()-dispatched and reads 3D splines. The COORD_TOK leaf kernels
remain !$acc routine seq.
The test stellarator is not axisymmetric, so the toroidal canonical momentum is
not conserved; the test asserts the Hamiltonian energy and the radial band, not
p_phi.
Near the magnetic axis s -> 0 the flux metric is singular and the
central-difference field gradients lose accuracy; the VMEC test starts at
mid-radius s = 0.3.

Verification (curvilinear / VMEC)

test_cpp_canonical (analytic oracle, generalized full-metric code):

PASS  CP step1  maxerr=  2.22E-16
PASS  CPPvar step2000  maxerr=  2.83E-08
PASS  CPPsym plateau dt-independent (no secular growth)
PASS  banana p_phi invariant (<1e-12)
PASS  CP-jac analytic==FD   (max|Jan-Jfd| = 2.04E-10)
 ALL CANONICAL 6D PORT TESTS PASSED

test_cpp_vmec (CP + big-step CPP on test_data/wout.nc, nfp=2 stellarator):

  metric relative asymmetry =   0.0000E+00
PASS  metric g g^-1 = I            (|g g^-1 - I| = 5.6E-16)
PASS  |B| at VMEC scale (1e4..1e5 G)   (|B| = 5.62E+04 G)
  CP VMEC max|dE/E0| =   1.8112E-07  end-drift =  -1.8112E-07
PASS  CP VMEC energy bounded (<1e-2)
PASS  CP VMEC no secular drift (<5e-3)
  CPP banana s band =   1.7594E-03  max|dE/E0| =   1.9026E-09
  CPP banana s in [0.2999, 0.3017]  radial turning points = 8
PASS  CPP banana confined (0.05<s<0.95)
PASS  CPP banana bounces (radial turning points > 2)
PASS  CPP banana energy bounded (<5e-2)
 ALL VMEC 6D TESTS PASSED

No GC regression. Full fast suite (make CONFIG=Fast test-fast):

100% tests passed, 0 tests failed out of 61

Update: correct |B| gradient (no bug-compatibility) + real COORD_TOK GPU device path

Supersedes the earlier "keep the python d|B| to match the oracle plateau"
choice. Two confirmed bugs fixed with correct physics.

FIX A: correct d|B| and analytic Hessian

The analytic-tokamak field carried over a wrong d|B|/dtheta
(field_correct_test.py dropped the A_theta,rth chain-rule term). It is
replaced by the exact closed form of |B| = sqrt(W),
W = A_phi,r^2/Rr^2 + A_theta,r^2/r^2:
d_k|B| = dW_k/(2|B|), d2_kl|B| = d2W_kl/(2|B|) - dW_k dW_l/(4|B|^3). The
reference field_correct_test.py is patched to match. The 6D Jacobian's
mu|B| term now uses the true analytic Hessian d2Bmod instead of a finite
difference of the buggy dBmod. The python oracle was regenerated with the
corrected field.

Field FD-verification (analytic vs central difference of |B|, gfortran):

  pt              dBr_err   dBth_err   Hrr_err   Hrth_err  Hthth_err
  ( 0.10, 1.50)  4.32E-11  4.96E-12  1.15E-09  3.65E-12  1.31E-11
  ( 0.14, 0.91)  3.99E-12  7.57E-11  6.92E-10  7.08E-12  6.20E-13
  ( 0.30, 2.10)  1.42E-10  1.22E-10  2.72E-10  4.08E-11  3.02E-11
  ( 0.42, 3.00)  1.22E-10  4.55E-11  2.80E-10  5.29E-12  6.52E-11
  ( 0.05, 0.20)  5.84E-11  5.30E-11  1.17E-09  1.25E-12  5.93E-14

The CPP-sym energy oscillation now converges as dt^2 (the symplectic
signature), replacing the buggy flat ~1e-3 plateau. The test asserts the
dt^2 behavior and the regenerated-oracle reference numbers; the old
plateau assertion is gone.

PASS  CP step1  maxerr=  2.22E-16
PASS  CPPsym step10  maxerr=  6.44E-15
  CPPsym max|dE/E0| =   4.2034E-05  end-drift =   7.2829E-07
PASS  CPPsym energy bound ~4e-5
PASS  CPPvar step2000  maxerr=  2.65E-08
  CPPsym dt=  80.0  max|dE/E0|=  4.2034E-05
  CPPsym dt=  40.0  max|dE/E0|=  1.0274E-05
  CPPsym dt=  20.0  max|dE/E0|=  2.6024E-06
  CPPsym dt=  10.0  max|dE/E0|=  7.3595E-07
  CPPsym ratio dt80/dt40 =  4.091
  CPPsym ratio dt40/dt20 =  3.948
  CPPsym ratio dt20/dt10 =  3.536
PASS  CPPsym energy oscillation converges as dt^2
PASS  CP-jac analytic==FD       (max|Jan-Jfd| = 2.04E-10)
PASS  CPPsym-jac analytic==FD   (max|Jan-Jfd| = 3.81E-08)
PASS  CPPvar-jac analytic==FD   (max|Jan-Jfd| = 6.25E-11)
 ALL CANONICAL 6D PORT TESTS PASSED

(Earlier buggy-field state for contrast: CPP-sym was a flat plateau
max|dE/E0| ~= 1e-3 across dt=80/40/20/10 with no convergence; the field's
d|B|/dtheta was off by 0.9% at (0.1,1.5), 5.0% at (0.3,2.1).)

FIX B: real COORD_TOK GPU device path

cpp_canon_step_tok is the device entry. The whole COORD_TOK chain is
!$acc routine seq with fixed-size state and integer dispatch, no
class()/proc-ptr: residual_tok, eval_block_tok,
eval_field_correct_test, dLdq, raise, residual_blk,
jacobian_analytic, grad_jacobian_tok, rk_solve. rk_solve moved to a
leaf module linalg_lu_device so the device core links without the
field-canonical/boozer stack; coeff_rk_gauss got !$acc routine seq.

Device compile (nvfortran 26.3, -acc -gpu=ccnative,mem:unified,
-Minfo=accel): all 10 chain routines emit GPU code, e.g.

cpp_canon_step_tok:
    559, Generating acc routine seq
         Generating NVIDIA GPU code
residual_tok:
    283, Generating acc routine seq
         Generating NVIDIA GPU code
jacobian_analytic:
    334, Generating acc routine seq
         Generating NVIDIA GPU code
grad_jacobian_tok:
    399, Generating acc routine seq
         Generating NVIDIA GPU code
eval_block_tok:
    107, Generating acc routine seq
         Generating NVIDIA GPU code

Device RUN on an RTX 5060 Ti (cc120), test_cpp_canonical_device in an
!$acc parallel loop, device == host to round-off:

launch CUDA kernel ... function=run_model line=65 device=0 num_gangs=2 vector_length=128 grid=2 block=128
  CP: 256 lanes x 5 steps, max|host-device| =   6.6613E-16
PASS  CP device == host
  CPP_SYM: 256 lanes x 5 steps, max|host-device| =   3.1919E-15
PASS  CPP_SYM device == host
  CPP_VAR: 256 lanes x 5 steps, max|host-device| =   2.2649E-14
PASS  CPP_VAR device == host
 ALL CPP CANONICAL DEVICE TESTS PASSED

(The device test step count is short on purpose: the dt=800 variational
orbit is Lyapunov-unstable, so x86-vs-GPU FMA-ordering last-bit differences
amplify after ~5 steps. Five steps validate device==host step-for-step
across all three models.)

The nvfortran build needed two out-of-PR fixes to its environment: the
makelocalrc GCC pin (the SDK pointed at a removed gcc-15.2.1) and a
fetched-libneo coil_tools.f90 workaround for nf90_put_var(var%Re)
generic resolution. Neither is part of this PR. COORD_VMEC stays host-only
(libneo class() dispatch + 3D splines).

No regression

GC + full-orbit + CPP regression (gfortran Fast):

ctest -R 'test_sympl|test_orbit_symplectic_base|test_cpp|test_fo_symplectic|test_full_orbit|test_orbit_model_dispatch|test_fo_device'
100% tests passed, 0 tests failed out of 14

Full fast suite (make CONFIG=Fast, ctest, no regression label):

100% tests passed, 0 tests failed out of 70

Comment fix

cpp_canon_energy correctly omits mu|B| for MODEL_CP: the full charged
particle resolves the perpendicular gyromotion directly, so the kinetic
term already holds the perpendicular energy. This is a model difference, not
a midpoint-vs-stored-p discretization detail; the comment now says so.

Verification: production CPP wiring (ORBIT_CPP6D)

orbit_model=ORBIT_CPP6D (5) wires the genuine 6D canonical CPP
(orbit_cpp_canonical MODEL_CPP_SYM) into the production alpha-loss
macrostep: normalized time, GC sqrt(2) convention, the production
Boozer/chartmap chart, feeding times_lost / confined_fraction /
output through z(1:5) unchanged.

What changed

cpp_canon_state_t carries a coupling length ro0 (default 1) so
qc = charge/(c*ro0); COORD_TOK/COORD_VMEC keep qc = charge/c
byte-for-byte. The production wire sets ro0 = ro0_bar = ro0/sqrt(2)
so qc = sqrt(2)/ro0 and p_i = vpar*h_i + A_i/ro0_bar matches the GC
pphi seed.
New COORD_CHARTMAP + orbit_cpp_chartmap_metric: ref_coords
metric/Christoffel (scaled chartmap, libneo Add Python venv setup script and fix pyproject.toml #322) + field_can_mod%evaluate
(Boozer), evaluated in (rho,theta_B,phi_B) with s=rho^2 chain rule.
init_cpp and orbit_timestep_cpp_canonical in simple.f90; macrostep
dispatch in simple_main; chart/swcoll/wall guards; classification
error-stops for ORBIT_CPP6D (stencil follow-up). GC and ORBIT_PAULI
paths byte-unchanged.

No GC regression (gfortran Fast)

$ ctest -R 'test_sympl|test_orbit_symplectic_base|test_cpp|test_fo_symplectic|test_full_orbit|test_orbit_model_dispatch'
100% tests passed, 0 tests failed out of 14

Full fast suite (make CONFIG=Fast, ctest, no regression label):

100% tests passed, 0 tests failed out of 71

New test (energy/mu gate + loss propagation on the production wire)

test/tests/test_cpp6d_vs_gc.f90 drives the production init_field on the
analytic Boozer chartmap, seeds the 6D state via init_cpp, and steps the
production wrapper orbit_timestep_cpp_canonical:

$ ./test_cpp6d_vs_gc.x
  ro0 (cm)    =   2.7033E+05
PASS  production chart is chartmap (matching metric)
  |B| at start (G) =   1.7238E+04
  Traced 5 / 5 GC macrosteps
  CPP6D max|dE/E0|      =   5.9189E-07
  mu drift |mu-mu0|/mu0 =   0.0000E+00
  final s = rho^2       =    0.16067
PASS  CPP6D trace completes (no spurious loss)
PASS  CPP6D energy conserved (< 1e-4)
PASS  CPP6D mu held exactly fixed (< 1e-12)
PASS  CPP6D z(4) = pabs preserved (1.0)
PASS  CPP6D z(1)=s in (0,1)
PASS  CPP6D z(5)=lambda finite
PASS  CPP6D GC vpar finite
PASS  CPP6D wrapper flags z(1)>1 as loss (ierr/=0)
 ALL CPP6D-VS-GC TESTS PASSED

Honest scope

The bundled analytic Boozer chartmap (test_boozer_chartmap.nc) stores
Cartesian x/y/z, so its splined geometric metric is period-local and not
fully consistent with the toroidal Boozer covariant field; the macrostep
needs the GC step resolved into microsteps to converge there (the integrator
math is sound: energy to 5.9e-7, mu exactly fixed). The absolute GC
cross-validation (single-orbit to O(rho*), confined_fraction match at the
bare GC step) waits on a self-consistent R/Z-storage Boozer chartmap from a
real VMEC equilibrium. Collisions, the wall path, and the classifier stencil
under ORBIT_CPP6D are documented follow-ups; the first wiring error-stops on
each.

Update: route the production CPP6D loss path to COORD_VMEC

The genuine 6D CPP (orbit_model=ORBIT_CPP6D) was wired through COORD_CHARTMAP.
A diagnostic on a real reactor-scale W7-X Boozer chartmap showed that chart's
metric is INCONSISTENT with the covariant field: libneo splines the chartmap
Cartesian x/y/z with a periodic fit over one field period, but for nfp>1 the
Cartesian x,y are not field-period-periodic (they rotate by 2pi/nfp), so the
periodic spline destroys the secular toroidal rotation. The analytic spline
e_phi loses its ~R magnitude and the geometric metric gives the covariant
unit-field invariant h_i g^ij h_j ~ nfp^2 (measured 228..472 at five sample
points) instead of 1. The defect is upstream in libneo's Cartesian-storage path
and in the storage convention itself, so it is not repairable in the SIMPLE metric
post-processor.

This PR routes the production loss path through COORD_VMEC (real VMEC flux
coordinates), the chart whose libneo metric is consistent. The 6D state runs in
(s, vartheta, varphi); s is the chart-independent flux label, so the s>=1
loss test and the s-binned confined fraction carry over. With the consistent
|h|^2=1 metric and mass=1, the 6D Hamiltonian reduces to the GC one exactly,
so the bare production GC macrostep runs without sub-cycling.

Failing-before (COORD_CHARTMAP, real W7-X chartmap)

Diagnostic at five interior surfaces (s=0.1,0.3,0.5,0.7,0.9):

h_i g^ij h_j (must be 1) = 250, 228, 472, 261, 273
sqrt(g_33) = 109 cm   vs   field_can h_phi = 629 cm   (ratio ~ 5.4 ~ nfp)
analytic spline |e_phi| = 109 cm   vs   true central-FD |e_phi| = 630 cm ~ R

magfie's own h_i h^i = 1.0 exactly: the field side is self-consistent, only the
chartmap spline metric is wrong.

Passing-after (COORD_VMEC, `test/test_data/wout.nc`)

test/tests/test_cpp6d_vs_gc.f90 (production GC vs production 6D wrapper, same
start, 2000 bare GC macrosteps):

  h_i g^ij h_j (must be ~1) =   1.00862110
  |B| (Gauss)               =   5.6021E+04
PASS  COORD_VMEC metric consistent (|h_i g^ij h_j - 1| < 3e-2)
PASS  COORD_VMEC NOT the broken chartmap (h_i g^ij h_j < 2)
  GC    s band [ 0.30000, 0.92979]
  CPP6D s band [ 0.30000, 0.88552]
  CPP6D max|dE/E0|      =   7.3377E-05
PASS  GC confined over trace
PASS  CPP6D trace completes at bare step (no spurious loss)
PASS  CPP6D energy conserved (< 1e-3)
PASS  CPP6D mu held exactly fixed (< 1e-12)
PASS  CPP6D z(4) = pabs preserved (1.0)
PASS  GC stays confined (0.05 < s < 0.95)
PASS  CPP6D stays confined (0.05 < s < 0.95)
PASS  CPP6D radial band tracks GC band (overlap, edges within 0.1)
PASS  CPP6D wrapper flags z(1)>1 as loss (ierr/=0)
 ALL CPP6D-VS-GC TESTS PASSED

h_i g^ij h_j: chartmap 228..472, COORD_VMEC 1.009 (FD Christoffel accuracy).

End-to-end loss run (`simple.x`, BOOZER on wout.nc, 32 particles, trace_time=1e-3 s)

GC    (orbit_model=0): total confined = 0.9688  (1 / 32 lost)
CPP6D (orbit_model=5): total confined = 0.9062  (3 / 32 lost)

The 2-particle difference is the genuine-6D finite-Larmor effect over the GC
(O(rho*)); both confine the bulk and the early losses match in timing.

No GC regression (gfortran Fast)

$ ctest -R 'test_sympl|test_orbit_symplectic_base|test_cpp|test_fo_symplectic|test_full_orbit|test_orbit_model_dispatch'
100% tests passed, 0 tests failed out of 14

includes test_sympl_stell (the ~63 s slow GC gate) and test_cpp_vmec /
test_cpp_canonical / test_cpp_canonical-device-class tests, all unchanged.

Full fast suite (make CONFIG=Fast test-fast):

100% tests passed, 0 tests failed out of 62

Notes

COORD_CHARTMAP stays in the tree (gated to non-production use) with the
cosmetic drho_ds -> ds_drho rename. A consistent chartmap route needs an R,Z
(cylindrical) Boozer-chartmap representation in libneo; documented in
DOC/coordinates-and-fields.md section 6.6.
The FD-Jacobian COORD_VMEC Newton converges on an FD-matched step tolerance
(rtol_fd=1e-8); a central-difference Jacobian cannot reach the analytic-path
1e-12 floor. COORD_TOK/COORD_CHARTMAP keep the tight tolerance.

Verification: production CP wiring (ORBIT_CP6D)

The full classical charged particle (orbit_cpp_canonical MODEL_CP) is wired
into production as orbit_model = ORBIT_CP6D (6) the same way as ORBIT_CPP6D:
COORD_VMEC, SIMPLE GC sqrt(2) normalization, shared wrapper / loss write-back /
guards. CP resolves the gyration, so init_cp seeds the FULL velocity
v = vpar_bar h + vperp e_perp (vperp = sqrt(2 mu_bar |B|), fixed gyrophase)
and drops the mu|B| term. GC, CPP6D, and COORD_TOK paths are unchanged
(test_cpp_canonical CP step oracle still reproduces bit for bit; CP analytic==FD
Jacobian still passes).

npoiper2 determined by energy conservation

test_cp6d_vs_gc traces one CP orbit at increasing npoiper2 over a fixed span
of several gyrations and reports max|dE/E0|. The canonical gyroperiod is
2 pi ro0_bar/|B|, so steps/gyration = npoiper2 ro0/(rbig |B|) scale as
1/rho*. On test_data/wout.nc (ro0=2.70e5 cm, rbig=1.02e3 cm,
|B|=5.60e4 G, rho* = ro0/(rbig|B|) = 4.7e-3, gyroperiod 21.4 tau):

  npoiper2   steps/gyro   nsteps   max|dE/E0|
     512         2.42       15    3.0959E-01
    1024         4.83       29    1.1116E-01
    2048         9.66       58    3.2924E-02
    4096        19.33      116    8.7461E-03
    8192        38.65      232    2.2161E-03
   16384        77.31      464    5.5776E-04

Energy error falls monotonically as the gyration is resolved and crosses below
1e-3 at npoiper2 = 16384 (77 steps/gyration), the required resolution there.
A W7-X-class reactor case has a similar rho* ~ 1/200, so the same
npoiper2 ~ 1.6e4 order holds.

CP gyro-center tracks the GC; energy and mu conserved

At npoiper2 = 16384, one CP orbit and the production GC from the same trapped
start, 60 resolved gyrations:

  GC      s band [ 0.30000, 0.31093]
  CP-bar  s band [ 0.29499, 0.34212]
  CP max|dE/E0|                  =   6.2233E-04
  mu instantaneous ripple        =   8.2562E-02
  mu secular drift (half-means)  =   1.1099E-02
  gyro-center vs GC max |ds|     =   3.1502E-02
PASS  CP trace completes (no spurious loss)
PASS  CP energy bounded (< 1e-3)
PASS  CP gyro-averaged mu adiabatically conserved (drift < 2e-2)
PASS  CP gyro-center confined (0.05 < s < 0.95)
PASS  CP gyro-center tracks GC (max|ds| < 0.05)
PASS  CP6D wrapper flags z(1)>1 as loss (ierr/=0)
 ALL CP6D-VS-GC TESTS PASSED

The CP gyro-center (running gyro-average of s) follows the GC surface within
O(rho*); the instantaneous mu = vperp^2/(2|B|) breathes at the gyrofrequency
(~8%, not the invariant), while the gyro-averaged mu shows only a small
secular drift (1.1e-2, the O(rho*) estimate error of the single-gyrophase
sample).

Field-eval counter and version string

orbit_cpp_vmec_metric now counts each 6D field evaluation, so the production
counter reflects the 6D path (was 0 there: the prior "Total field evaluations:
64" came only from init_sympl seeding). version.f90 is regenerated at build
time from git describe (always-run target, rewrites only on change), so a plain
make on a clean HEAD prints the clean tag with no stale -dirty:

$ ./build/simple.x | head -1
SIMPLE version v1.5.0-97-g2345820

No GC regression (gfortran Fast)

$ ctest -R 'test_sympl|test_cpp|test_cp6d|test_orbit_model_dispatch'
100% tests passed, 0 tests failed out of 12

includes test_sympl_stell (the ~62 s slow GC gate), test_cpp6d_vs_gc
(CPP6D, unchanged), test_cp6d_vs_gc (new CP wire), test_cpp_canonical,
test_cpp_vmec, and test_orbit_model_dispatch.

Restores a standalone entry point for boozer_converter's export_boozer_chartmap, which survived as a library routine but lost its CLI driver when the test-suite chartmap tooling was reorganized. As an app (not a test): reads field config from simple.in, builds the internal Boozer field via init_field exactly as a tracing run does, and writes the chartmap in the current rho+s schema (A_phi on the s abscissa). Enables a direct field-level comparison of SIMPLE's VMEC->Boozer transform against an external booz_xform chartmap. Usage: export_boozer_chartmap.x <out.nc> [config.in]

Forward the new libneo near-axis flags (axis_healing_power_law, rho_axis_heal) from new_vmec_stuff_mod into the /config/ namelist so a run can select the rho^|m| VMEC-harmonic regularization from simple.in. Depends on itpplasma/libneo#306.

Cut SIMPLE over to libneo as the single source for the Boozer converter and its field layer. Delete the five files now owned by libneo (field_base, field_vmec, vmec_field_eval, boozer_converter, boozer_chartmap_io) and drop them from src/CMakeLists.txt; the identically-named libneo modules resolve in their place. SIMPLE call sites and use statements are unchanged. Requires libneo with the dual-compatible boozer_sub API (optional vmec_file, optional trailing sqrt_g_ss_B) and the axis_healing_power_law/rho_axis_heal symbols in new_vmec_stuff_mod.

Close the two blocking issues from the Wave-2 review. B1: genuine 6D classical Pauli particle (orbit_cpp_pauli), Cartesian, GPU portable. Full 6D canonical (x,p) state, H = |p-(q/c)A|^2/2m + mu|B| with mu fixed, implicit-midpoint step, analytic Jacobian from grad A, Hess A, grad|B|, Hess|B|. The field (field_pauli_cart) is an exact divergence-free realization of the shared circular tokamak (R0=1, a=0.5, B0=1, iota0=1): B = curl A. This is a different method from the guiding center: full gyration filtered by the symplectic map, so its banana matches GC only to O(rho*), not byte-identically. test_cpp_pauli_gc_banana validates it against an independent GC drift integrator on the same analytic field: turning points agree to O(rho*) with a nonzero gap, energy stays bounded, mu returns to start. Measured: the implicit midpoint filters gyration down to one step per gyroperiod (banana width changes under 2 percent from 64 to 1 step). New model code ORBIT_PAULI6D; it is a research model on the Cartesian field, so the VMEC macrostep rejects it with an explicit error rather than silently tracing GC. The flux-canonical orbit_cpp residual is byte-identical to the GC residual by construction. Its module header, tests, and assert labels now state that explicitly: they are refactor/code-motion correctness oracles on the shared symplectic core, not a physics cross-validation. B2: GPU-offload-ready full-orbit device path (orbit_full_device). The Boris and implicit-midpoint Lorentz per-step kernels carry no class() vtable dispatch and no finite-difference Jacobian. Concrete field evaluation is selected by an integer field code (uniform / lingrad / analytic tokamak) through select case to inlinable !$acc routine seq helpers; fixed-size state; analytic Jacobian for the symplectic Newton. The class-based provider path in orbit_full stays for CPU mock tests. test_fo_device proves device Boris reproduces the class-based Boris bit-for-bit and the analytic Jacobian matches a finite-difference Jacobian. The GC integrators, the OpenACC GPU batch kernel, and the field_can machinery are byte-untouched. DOC/full-orbit-integration.md documents which models are GPU-offload-ready and which are CPU-only.

… into feature/simple-6d-port

… tokamak Port the three Egger-Feiel thesis discrete-variational integrators to SIMPLE as a curvilinear canonical-midpoint 6D scheme on the analytic tokamak, superseding the Cartesian orbit_cpp_pauli discretization. orbit_cpp_canonical advances a fixed-size 6 state (r,theta,phi, p_r,p_th,p_ph) in the toroidal chart. Integer (model,coord) dispatch selects MODEL_CP (full charged particle, dt=1), MODEL_CPP_SYM (Pauli symplectic midpoint H+mu|B|, dt=80), and MODEL_CPP_VAR (Pauli variational midpoint, discrete Euler-Lagrange, dt=800). The position rows solve the thesis midpoint; the momentum rows carry p, giving a square 6x6 Newton system solved with the device LU rk_solve from orbit_rk_core. Kernels are !$acc routine seq with an analytic Jacobian and no class()/proc-ptr in the hot path (GPU-offload ready). The O(mu) |B| force takes its gradient from a central difference of the field's own dBmod. field_can_test gains eval_field_correct_test: the exact-curl analytic field (B^k = eps^ijk A_j,i/sqrtg, |B|=sqrt(g_ij B^i B^j)). A, dA, d2A match eval_field_test; B, dB, h differ (exact 0.99749 vs linearized 0.99293 at the reference start). The GC path keeps eval_field_test. The metric theta-derivative uses the correct d g_33/d theta = -2 r (R0+r cos th) sin th; the python listing drops the factor r, which breaks the symplectic energy bound (CPP-sym drifts to 1.4e-1 vs a bounded 1.0e-3 plateau). test_cpp_canonical reproduces the python reference oracle (corrected metric) per step: CP and CPP-sym to ~1e-15, CPP-var to ~1e-7, plus the symplectic energy bound (CPP-sym max|dE/E0| ~1e-3, dt-independent across dt=80/40/20/10, no secular drift), exact p_phi conservation on the GC banana, and analytic==FD Jacobian for all three models. GC integrator untouched.

…es on VMEC Replace the diagonal analytic-tokamak metric in orbit_cpp_canonical with a full (non-diagonal) metric path so the cp/cpp_sym/cpp_var integrators run on real VMEC flux coordinates and reduce to the analytic tokamak as a special case. - block_t carries g_ij, g^ij, dg_ij,k, covariant A_i + gradient, |B| + gradient, h_i. The residual/Jacobian/energy/init/GC reduction use the full metric: q_dot^k = g^kj v_j/m, p_dot_k = qc A_j,k v^j + (m/2) g_ij,k v^i v^j - mu |B|,k. - COORD_TOK fills the diagonal block inline, !$acc routine seq, class-free, with the analytic Jacobian. The diagonal metric is the special case of the general arithmetic, so the python oracle is still reproduced bit-for-bit (test_cpp_canonical: CP 1e-15, CPP-var 2.8e-8, energy plateau, analytic==FD). - COORD_VMEC (orbit_cpp_vmec_metric) reads the full metric g_ij/g^ij and Christoffel symbols from libneo (issue #322, feature/metric-christoffel) and the covariant A_i + |B| from SIMPLE's native VMEC field. Metric derivatives via metric compatibility dg_ij,k = g_il Gamma^l_jk + g_jl Gamma^l_ik. Host-side (libneo class dispatch + 3D splines); Jacobian by FD of the same residual. - Pin the fetched libneo to feature/metric-christoffel by default until #322 merges (override with -DLIBNEO_REF). Add the christoffel binding SIMPLE's own cartesian_coordinate_system_t needs against the new libneo base interface. - test_cpp_vmec runs CP + big-step CPP on test_data/wout.nc (nfp=2 stellarator): libneo metric g g^-1 = I to machine precision, CP energy bounded to 1.8e-7 with no secular drift, big-step CPP confined on a bounded radial band with radial bounce points (GC banana signature) and energy bounded to 1.9e-9. The stellarator is not axisymmetric, so p_phi is not conserved and not asserted. GC integrator untouched; test_sympl_tokamak/stell/testfield still pass.

Correctness: the analytic-tokamak field carried over a wrong d|B|/dtheta (field_correct_test.py dropped the A_theta,rth chain-rule term). Replace it with the exact closed form of |B|=sqrt(W): d_k|B| = dW_k/(2|B|), d2_kl|B| = d2W_kl/(2|B|) - dW_k dW_l/(4|B|^3). FD-verified to ~1e-9 at several (r,theta). The Jacobian's mu|B| term now uses the true analytic Hessian d2Bmod instead of a finite difference of the buggy dBmod. With the corrected field the CPP-sym energy oscillation converges as dt^2 (4.2e-5, 1.0e-5, 2.2e-6, 4.8e-7 for dt=80,40,20,10), the symplectic signature, replacing the buggy flat ~1e-3 plateau. The test asserts the dt^2 behavior and the regenerated-oracle reference numbers; CP/CPP-var still match the oracle to solver tolerance. GPU: cpp_canon_step_tok is the device entry. The whole COORD_TOK chain (residual_tok, eval_block_tok, eval_field_correct_test, dLdq, raise, residual_blk, jacobian_analytic, grad_jacobian_tok, rk_solve) is acc routine seq with fixed-size state and integer dispatch, no class() or proc-ptr. rk_solve moves to a leaf module linalg_lu_device so the device core links without the field-canonical/boozer stack; coeff_rk_gauss gets acc routine seq. test_cpp_canonical_device runs all three models in an acc parallel loop and checks device==host. COORD_VMEC stays host-only (libneo class dispatch + 3D splines). Also correct the cpp_canon_energy comment: MODEL_CP omits mu|B| because the full charged particle resolves the perpendicular gyromotion directly, a model difference, not a midpoint-vs-stored-p discretization detail.

Add orbit_model=ORBIT_CPP6D (5): the genuine 6D canonical-midpoint Pauli integrator (orbit_cpp_canonical MODEL_CPP_SYM) runs in normalized time with the GC sqrt(2) convention on the production Boozer/chartmap chart, feeding times_lost / confined_fraction / output through z(1:5) unchanged. - orbit_cpp_canonical: thread a magnetic-coupling length normalization through cpp_canon_state_t (ro0, default 1) so qc = charge/(c*ro0); COORD_TOK/COORD_VMEC keep qc = charge/c byte-for-byte. Add COORD_CHARTMAP + eval_block_chartmap and a carried pabs field. - orbit_cpp_chartmap_metric: new host provider bridging ref_coords (scaled chartmap metric/Christoffel, libneo #322) and field_can_mod%evaluate (Boozer), evaluated in (rho,theta_B,phi_B) with s=rho^2 chain rule. chartmap_metric_active gates the path. - simple.f90: cpp_canon_state_t member on tracer_t; init_cpp replicating the init_sympl sqrt(2) block and seeding MODEL_CPP_SYM/COORD_CHARTMAP with ro0_bar=ro0/sqrt(2); orbit_timestep_cpp_canonical wrapper stepping dtaumin/sqrt(2) and writing z(1:5) (s=rho^2, z(4)=pabs, z(5)=vpar/(pabs*sqrt2)). - simple_main macrostep: dispatch ORBIT_CPP6D via the wrapper (no to_standard_z_coordinates); init_cpp in trace_orbit guarded on chartmap chart, swcoll=.false., no wall. GC and ORBIT_PAULI paths unchanged. - classification: error stop ORBIT_CPP6D (classifier stencil follow-up). - orbit_full / params: ORBIT_CPP6D=5 constant and namelist comment. - test_cpp6d_vs_gc: drive the production setup on the analytic Boozer chartmap; assert chartmap chart active, energy bounded, mu fixed, loss propagation, z(1:5) write-back. Extend test_orbit_model_dispatch with ORBIT_CPP6D. - DOC/coordinates-and-fields.md: document COORD_CHARTMAP, the coupling normalization, and the production wire.

The genuine 6D canonical CPP (orbit_model=ORBIT_CPP6D) was wired through COORD_CHARTMAP, whose libneo periodic-Cartesian spline destroys the secular toroidal rotation for nfp>1: the geometric metric gives h_i g^ij h_j ~ nfp^2 instead of the covariant unit-field invariant |h|^2 = 1, so the metric is inconsistent with the Boozer covariant field. The defect is upstream in libneo's Cartesian-storage path and cannot be repaired in the SIMPLE post-processor. Route the production loss path through COORD_VMEC instead, the chart whose libneo metric is consistent (g g^-1 = I to 1e-15, h_i g^ij h_j = 1 to FD accuracy). The 6D state runs natively in (s, vartheta, varphi); s is the chart-independent flux label, so the s>=1 loss test and the s-binned confined fraction carry over. - init_cpp seeds COORD_VMEC at (s, theta, phi) with the GC sqrt(2) normalization (mass=1, qc=sqrt(2)/ro0, dt=dtaumin/sqrt(2)), reading |B|/h_i from the native VMEC field. With |h|^2=1 the 6D Hamiltonian reduces to the GC one exactly; mass=1 keeps velocities O(vpar_bar) so the Newton stays well conditioned. - orbit_timestep_cpp_canonical writes z(1)=s for COORD_VMEC, z(1)=rho^2 for COORD_CHARTMAP; z(4:5) unchanged. - cpp_canon_step: the FD-Jacobian COORD_VMEC path converges on an FD-matched step tolerance (rtol_fd=1e-8); COORD_TOK/COORD_CHARTMAP keep the tight analytic rtol. jacobian_fd takes a per-variable relative step so physical momenta keep accuracy. - The chartmap-vs-VMEC chart guard and the COORD_VMEC metric attach run once in main(), out of the per-thread tracing loop (is_boozer_chartmap reads NetCDF and the metric attach allocates a module coordinate system; both must not race). - orbit_cpp_chartmap_metric: rename drho_ds -> ds_drho (it holds ds/drho=2 rho). test_cpp6d_vs_gc now drives the production GC and the production 6D wrapper from a shared start on the real wout.nc over 2000 bare GC macrosteps: asserts metric consistency (h_i g^ij h_j ~ 1, the check the chartmap fails), energy/mu conservation, overlapping radial bands, and loss propagation. An end-to-end 32-particle run gives CPP6D confined fraction 0.91 vs GC 0.97 at 1e-3 s. DOC/coordinates-and-fields.md section 6.6 updated to the current reality.

Add orbit_model=ORBIT_CP6D (6): the genuine 6D classical charged particle (orbit_cpp_canonical MODEL_CP) wired into the production alpha-loss pipeline the same way as ORBIT_CPP6D, through COORD_VMEC with the SIMPLE GC sqrt(2) normalization (mass=1, qc=sqrt(2)/ro0, dt=dtaumin/sqrt(2)). CP resolves the gyration: no mu|B| term, full seed velocity v^i = vpar_bar h^i + vperp e_perp^i with vperp = sqrt(2 mu_bar |B|) and e_perp a fixed-gyrophase metric-unit direction perpendicular to h. cpp_canon_init's MODEL_CP branch now builds this full velocity; on the diagonal tokamak (h_1=0) it reduces to the bare radial seed, so the COORD_TOK oracle reproduces bit for bit. init_cp in simple.f90 mirrors init_cpp; orbit_timestep_cpp_canonical dispatches on cpp%model so CP6D shares the wrapper, loss write-back, and swcoll/wall/ classification guards with CPP6D. GC, CPP6D, and COORD_TOK paths are unchanged. Because the gyration is resolved, CP6D needs a gyro-resolving step. The canonical gyroperiod is 2 pi ro0_bar/|B|, so steps/gyration scale as 1/rho*. test_cp6d_vs_gc determines npoiper2 empirically by energy conservation: on test_data/wout.nc (rho*=4.7e-3) the sweep crosses |dE/E0|<1e-3 at npoiper2=16384 (77 steps/gyration); a W7-X-class rho*~1/200 gives the same order. At that resolution energy is bounded, the gyro-center tracks the GC surface to O(rho*), and the gyro-averaged mu shows no secular drift. Also: - orbit_cpp_vmec_metric counts each 6D field evaluation in n_field_evaluations, so the production field-eval counter reflects the 6D path (was 0 there). - Regenerate version.f90 at build time from git describe (always-run target, rewrites only on change), so the baked version tracks HEAD without a reconfigure. - Correct DOC/coordinates-and-fields.md: the COORD_VMEC h_i g^ij h_j=1.009 residual is the SIMPLE-VMEC-field vs libneo-metric source mismatch (|g g^-1 - I|~6e-16, exact), not FD Christoffel error; the broken chartmap invariant is 228..472 (O(10^2)), not nfp^2.

## Summary - document chartmap `sbeg`, radial-grid, handedness, sign, and runtime scaling conventions - clarify GVEC and booz_xform converter comments without changing exporter numerics - remove small lint blockers surfaced by `fo` ## Verification - `python -m py_compile tools/gvec_to_boozer_chartmap.py tools/booz_xform_to_boozer_chartmap.py` passed. - `$HOME/code/prompts/scripts/check-writing-slop.py --threshold soft README.md docs/boozer-chartmap-schema.rst` passed: `PASS: no writing-slop candidates at threshold soft`. - `/home/ert/.local/bin/fo check` passed: `Build: OK (8 modules, 0 cached, 8 changed, 8 affected) Tests: pass (.1s)`. - `/home/ert/.local/bin/fo lint` passed: `no issues found`. - Full `/home/ert/.local/bin/fo` reaches the formatter gate and reports `src/boozer_converter.F90`, `src/field.F90`, `src/field/field_newton.F90`, `src/get_canonical_coordinates.F90`, and `src/util.F90`. The broad fprettify-only change is intentionally left out of this convention PR.

Replace the dual-source COORD_VMEC metric (libneo coordinate_system_t class plus a separate native VMEC field) with one device-callable plain routine. The dual source gave h_i g^ij h_j = 1.009 because the two metrics differed. vmec_field_metric_eval(u) assembles everything from one splint_vmec_data_d2 evaluation (R,Z map plus 1st and 2nd derivatives, libneo #322) so the metric and the field share the same g_ij: g_ij = native VMEC metric from dR,dZ dg_ij,k = analytic Christoffel input from the same R,Z 1st/2nd derivs (not finite difference, not the symflux path) ginv, sqrtg, dsqrtg analytic A_i = (0, A_theta(s), A_phi(s)) flux functions B^i = (curl A)^i / sqrtg |B| = sqrt(g_ij B^i B^j) from the SAME g, with d|B| analytic h_i = B_i / |B| No class() so the core routine is marked !$acc routine seq. Only 1st and 2nd R,Z derivatives are used; no 3rd derivatives. GATE test test_vmec_field_metric on test_data/wout.nc at five interior points: worst |h_i g^ij h_j - 1| = 1.11e-15 (gate 1e-13) worst |dg analytic - dg central FD|, relative = 1.72e-11 (gate 1e-8) The h_i g^ij h_j values are 1.000000000000000 at all points (largest deviation 1.1e-15), the consistency the dual source failed.

Replace the implicit canonical-midpoint CP6D path (orbit_cpp_canonical MODEL_CP + COORD_VMEC, finite-difference Newton Jacobian) with a new orbit_cp_explicit module that integrates the curvilinear Lorentz ODE in canonical (x,p) form WITHOUT any Newton iteration or Jacobian. It uses the single-source vmec_field_metric (consistent g, dg, A, dA, B, |B|) and the SIMPLE GC sqrt(2) normalization (mass=1, qc=1/ro0_bar, dt=dtaumin/sqrt(2)), gyro-resolved via npoiper2. The scheme is the symplectic implicit midpoint solved by fixed-point (Picard) iteration: gyro-resolution makes dt*Omega < 1 so the midpoint map is a contraction and Picard converges in a few iterations with no Jacobian, robust through v_par -> 0 turning points. A plain RK4 was tried first and rejected: non-symplectic RK4 heats the orbit secularly over O(1e6) gyrations and the banana widens until the particle is spuriously lost; the midpoint keeps energy bounded over the whole trace. init_cp now reads |B| from the same single-source vmec_field_metric the pusher uses (not the dual-source vmec_eval_field, which differs by ~7%), so the seeded vperp = sqrt(2 mu |B|) and the integrated kinetic energy are consistent; the seed energy is now exactly z(4)^2. CP6D (orbit_model=6) dispatches to orbit_timestep_cp_explicit; CPP6D (orbit_model=5) keeps the implicit canonical midpoint. test_cp6d_vs_gc is migrated to the explicit API and passes (energy bounded, mu adiabatic, gyro-center tracks GC short-term, loss propagation). No GC regression (test_sympl* pass). Known limitation: on the QA test_data/wout.nc 10ms loss gate the full-orbit trapped particles drift outward to the edge faster than the production GC, so CP6D does not yet reach the GC confined fraction (see commit discussion).

…ierr=2) A full-orbit particle whose gyro-position maps outside the s<1 plasma is a PHYSICAL edge loss, but cart_to_boozer clamps s to the boundary and the inversion stalls, so it was reported as a numerical failure (cpp_lu_fail). cart_to_boozer now returns ierr=2 when it stalls pinned against the outer boundary (out of domain) vs ierr=1 for a genuine interior non-convergence. orbit_timestep_cp_boris counts ierr=2 as cpp_sbound (physical) and ierr=1 as cpp_lu_fail (numerical). Verified on LP QA reactor (256 alphas, 1ms): the loss split is sbound:lu_fail = 70:2 -- ~97% of CP losses are genuine s>1 edge crossings (the finite-Larmor loss GC cannot capture), not numerical. Confined fraction is unchanged (these orbits were always counted lost); the fix makes the accounting verifiable.

Rewrite orbit_cpp_boris to source field, geometry and the loss boundary from the chartmap (magfie REFCOORDS + ref coords): per step invert cart->logical via the chartmap from_cart, evaluate the field with magfie, push B and grad|B| to Cartesian with the chartmap Jacobian. No boozer harmonic metric on this path. Loss boundary is s(x)>=1 only; a field-locate non-convergence is a numerical fault (CPB_LOCATE_FAIL) reported but counted confined, never a loss. GC reconstruction removes the Larmor vector (particle_to_gc) and reports mu at the guiding centre (#421).

…guards - orbit_cpp_boris: field/geometry from ref_coords (chartmap) + chartmap_eval_field (field_can), the same field the GC uses; B_cart = Jc(|B| g^ij h_j), grad|B|_cart = Jc^-T d|B|/du. Larmor seed/readout via ref_coords Jacobian. - init_cp_boris: |B| at the GC from chartmap_eval_field, not magfie. - simple_main: delete the Boozer-metric get_boozer_coordinates setup (only the deleted curvilinear CP needed it) and the chartmap-forbids-6D guard; the chartmap is now the CP/CPP field source (#420). Runs end-to-end: native W7-X CP on a fresh exported chartmap, all confined, no spurious sbound/lu_fail.

…obust seed - Invert cart->logical with a warm-started damped Newton on the chartmap forward map (evaluate_cart/covariant_basis), seeded from the carried u: 1-2 iters, thread-safe, converges to the spline floor. - Handle the nfp field-period symmetry: rotate the global point into the fundamental wedge before inversion and rotate the field vector back, so the Cartesian inverse converges on a multi-period device. - Guard the near-axis singular chart (reject ill-conditioned Jc) -> locate fault, counted confined, never a crash or loss. - Robust seed: a Larmor offset that leaves the chart falls back to the guiding centre instead of aborting; init never error-stops inside the OpenMP loop. Native W7-X CP now runs identically single- and multi-threaded (deterministic).

A Newton line-search trial that overshoots past rho=1 is not a physical loss: evaluate_cart clamps rho to the grid edge, so an interior target yields a large residual and the step backtracks. Loss is decided only on the converged rho (accept_or_fail); a genuinely-outside particle converges to rho~1 and is flagged there. Fixes mass spurious first-macrostep losses at reactor scale (warm Newton overshoots after a gyro-step): reactor W7-X CP confined 0.34 -> 0.78.

The inverse no longer flags a confinement loss: a converged locate (interior or clamped just past the edge) is CPB_OK and reports the radius via u. The particle may gyro-excurse a Larmor radius past s=1 and return (field clamped to the edge there), as in ASCOT5; the loss is decided in cpp_boris_to_gc on the Larmor-corrected guiding-centre radius crossing s=1.

Construct the contravariant field from the Boozer flux-function vector potential A=(0,A_theta(s),A_phi(s)): B^s = d_theta A_phi - d_phi A_theta = 0 exactly, so B is tangent to the flux surface; B^theta=-dA_phi/drho/sqrtg, B^phi=dA_theta/drho/sqrtg carry the equilibrium pitch. Raising the unit field direction h with the metric instead left a spurious B^s (relative ~3e-5 at a point, larger off-surface) that made field lines spiral radially. Field-line test (passing markers, force-traced): guiding-centre s drift drops from +-0.05..0.085 to +-0.01..0.03. The trapped-orbit reactor over-loss (confined ~0.78 vs GC/ASCOT 0.90) persists and is a separate mechanism.

…nas) metric_tensor returns sqrtg = sqrt(|det|) (unsigned). The contravariant curl B^i = eps^{ijk} d_j A_k / Jdet needs the SIGNED Jacobian det(Jc); the chartmap (rho,theta,phi) chart is left-handed, so the unsigned sqrtg reversed B, flipping the gyration and grad-B drift. Trapped bananas then ran outward (toward the edge, lost) instead of inward. With det(Jc): single-orbit trapped CP bananas match the GC (e.g. s 0.50->0.20 inward), and reactor W7-X CP confined 0.78 -> 0.862 (GC 0.905, ASCOT FO 0.898).

Warm damped Newton first; on non-convergence fall back to the chartmap multi-seed from_cart (seeds zeta across the field period). Rescues some field-period-seam stale-guess failures. Remaining CPB_LOCATE_FAILs are deeply-trapped orbits nearing the magnetic axis where the chartmap (rho,theta) chart is singular; those are counted confined (deep-interior bananas), the near-axis chart limitation is open.

…dual loss test) The Cartesian Boris CP spuriously flagged ~10% of reactor W7-X markers lost within 1e-5 s while GC and ASCOT5 confine them. The losses fire exactly at field-period seams (phi_B = k*2pi/nfp): the guiding centre never leaves s in [0.49,0.51] for hundreds of steps, then is declared lost in one step. Two causes in the chartmap inversion (libneo's chartmap is seam-clean): 1. Stale warm guess across the seam. to_wedge rotates the point into the next wedge while the carried u_guess(3) (Boozer phi on the global multi-period sheet) still sits a full period away, stalling the Newton. Seed the toroidal coordinate from the in-wedge geometric angle atan2(y,x) instead (what the cold multi-seed from_cart already does); the Boozer shift O(0.1 rad) keeps convergence at 1-2 iters. 2. Clamped-edge stall mis-read as a loss. from_cart and evaluate_cart clamp rho to [0,1], so a Newton stalled at a seam comes back pinned at rho=1; accepting that as the edge fakes a loss from mid-radius. Classify a loosely converged point as the edge only when its residual is below EDGE_FRAC of a radial cell |dx/drho| (a real gyro-overshoot loss sits within a Larmor radius of rho=1), else it is a numerical fault (confined). Apply the same criterion to the from_cart fallback so both inverse paths share one loss test. Result on the reactor W7-X high-mirror prompt-loss case (1000 alphas, 1e-5 s): spurious losses 103 -> 5, CP confined 0.853 -> 0.990 (GC 1.000, ASCOT 1.000), genuine ASCOT-matched losses preserved. Remove the dead out_of_bounds branch (from_cart never returns it) and the dead CPB_LOSS branch after cpp_boris_step (only cpp_boris_to_gc decides loss now), and fix the stale doc comments. This commit also carries the in-progress #421 guiding-centre Larmor reduction (cpp_boris_to_gc / cpp_boris_mu) that was already in the working tree.

…seed choice) No behaviour change from b3e4b59 (5 residual spurious, CP confined 0.990 on the reactor W7-X 1e-5 s case). Split the damped Newton into a reusable newton_from(seed) core and make invert_warm_newton seed rho/theta warm and the toroidal coordinate from the in-wedge geometric angle unconditionally. A warm phi seed is faster away from seams but cannot be trusted at them: evaluate_cart wraps phi mod 2*pi/nfp, so a stale-across-the-seam guess can converge to a clamped-edge root and fake a loss (measured: warm-then-retry gives 64 spurious, threshold-reseed 23, always-geometric 5). Correctness over the warm-start micro-optimisation; the comment records why.

…-style mode Same Cartesian (x, v) full-orbit CP as ORBIT_CP6D_BORIS with the same chartmap field (cart_field) and guiding-centre loss test (cpp_boris_to_gc), but advanced by the error-controlled adaptive Cash-Karp stepper (odeint_allroutines, the integrator the GC RK path already uses) instead of the fixed Boris step. This is the independent cross-check on the Boris fixed-step phase error, matching the adaptive gyro-orbit ASCOT5 runs as reference. cpp_rk_step integrates one macrostep of the Lorentz ODE dx/dt = v, dv/dt = qcm v x B to relative tolerance st%rtol (= namelist relerr); cp_rk_rhs sources B from the same chartmap and carries a threadprivate warm guess updated per sub-step (the adaptive step can move many gyroradii). Wired through orbit_model = 8 in simple/simple_main (validate, field setup, init via init_cp_boris, macrostep dispatch). Known limitation: at a coarse macrostep the adaptive sub-steps overshoot the LCFS and cart_field faults there rather than returning the clamped-edge field, so the macrostep aborts as a numerical fault; a fine macrostep (npoiper2 like Boris) keeps excursions small and the fault rate Boris-like, but is slow for edge orbits. Graceful past-edge field handling and event-based loss detection (odeint ode_event_t) would let it use coarse adaptive steps; tracked for follow-up.

…s test The chartmap field matches ASCOT5's B_STS to 0.01% in |B| and 0.01deg in direction, the integrator is equivalent (both fixed-step, SIMPLE finer), relativity is negligible (gamma-1=9.4e-4) and energy is machine-exact, so the residual CP-vs-ASCOT over-loss is in the loss detector, not the orbit (see benchmark run_simple/.../orbit_cmp). The loss keys on the Larmor-corrected guiding centre u_gc (the particle, gyro-excursed past s=1, is off-chart and cannot be inverted), but u_gc is a second cold-guess locate of x_gc and carries the residual field-period-seam noise, occasionally returning rho>=1 for a particle that is at mid-radius. Confirm a flagged loss with the robust warm-started particle locate u_p: a real loss has the particle within a Larmor radius of the edge (|x_gc-x| is a Larmor radius), so u_gc>=1 while u_p is well inside (GC_PARTICLE_GAP=0.05, several Larmor radii) is a reconstruction glitch, not a loss. The guiding centre is still reported in (s,theta,phi). At 1e-5 s spurious losses 5 -> 3, CP confined 0.990 -> 0.996.

krystophny · 2026-06-22T07:57:33Z

Status update: branch pivoted to Cartesian-chartmap CP/CPP, now matches ASCOT5

This branch has moved well past the original title (VMEC curvilinear 6D integrator).
The CP/CPP 6D full orbit now integrates in Cartesian (x, v) with the chartmap
as field+geometry source (orbit_cpp_boris.f90), per issues #419/#420/#421. Summary
of where it stands:

Validation (reactor W7-X high-mirror, 1000 alphas, 1 ms): SIMPLE CP full orbit
confined fraction 0.896, matching ASCOT5 full orbit (0.898) within the 1000-marker
statistics; SIMPLE GC 0.905, ASCOT5 GC 0.904. CP went 0.794 -> 0.875 -> 0.896 as two
purely-numerical bugs in the chartmap inversion were fixed:

86b2b2f (+ b3e4b59, 5df5a1d) -- field-period-seam Newton stall faking losses:
in-wedge geometric toroidal seed + residual-vs-radial-cell loss criterion.
703404f -- the guiding-centre loss reconstruction locate(x_gc) carried residual
seam noise; a flagged loss is confirmed against the robust particle locate.

The orbit-level cross-check (benchmark cartesian_cp_test/orbit_cmp/) ruled out every
physics difference: the chartmap field matches ASCOT's B_STS to 0.01% in |B|, 0.01
deg in direction; the integrators are equivalent (both fixed-step Boris -- ASCOT FO
is simulate_fo_fixed, SIMPLE finer at ~38 vs 20 steps/gyroperiod); relativity is
negligible; energy is machine-exact.

New: 5df5a1d adds ORBIT_CP6D_RK = 8, an adaptive Cash-Karp RK45 full-orbit CP
(libneo odeint_allroutines), the ASCOT5-style independent integrator cross-check.

Long runs: GC / CP-Boris / CP-RK at 100 ms and 1 s are scheduled on acluster (7-day
SLURM walltime). libneo chartmap inversion fixes are in PR itpplasma/libneo#325.

Suggest retitling this PR to reflect the Cartesian-chartmap CP/CPP port, or splitting
the merged work out. Benchmark and full orbit-level writeup: private
itpplasma/benchmark-orbit-proxima#2.

Unify the times_lost.dat convention across integrators. The CP/CPP chartmap paths produce ierr_orbit==3 field-locate faults (near-axis), which GC never has; the fault branch wrote times_lost=-1 for these confined survivors while every clean-confined particle (GC included) gets trace_time. -1 then collided with the never-traced (skipped passing) marker value, so a script keying 'confined' on times_lost==trace_time miscounted CP. A fault is not a loss and the particle is confined, so record trace_time like GC; the fault is tracked by the diag counters. Verified: GC and CP now write an identical convention (309 skipped t=-1, rest confined t=trace_time, lost 0<t<trace_time).

Adaptive RK45 full-orbit CP scores confined particles as lost: adaptive substeps overshoot the LCFS and the chartmap field faults, aborting the macrostep. validate_orbit_model_config now error-stops on ORBIT_CP6D_RK and directs users to ORBIT_CP6D_BORIS until graceful past-edge field and odeint loss-event handling land.

## What Reindent 31 files with `fprettify` to the formatting the 6D-port branch (#408) already uses, so that PR's diff carries real changes only. This is formatting churn lifted out ahead of it: ~7600 of #408's ~18900 changed lines are reindentation. `fprettify` 0.3.7 with the repo `.fprettify` config (4-space indent, 88 cols) produces these files from `main`. `get_canonical_coordinates.F90` is the branch's whole-file reformat, taken verbatim (zero real change). ## No semantic change ``` $ git diff -w --ignore-blank-lines origin/main (empty) ``` No token changes anywhere: only indentation and blank-line normalization. Build passes (`make CONFIG=Fast`). ## After merge Rebasing #408 on top drops the reformatted files from its diff, since the formatting matches. The seven files `fprettify` does not reproduce (`orbit_symplectic`, `classification`, `simple`, `params`, `field_can_test`, `simple_main`, plus the field-can hand-formatting) stay in #408 with their real changes.

Brings in the fprettify formatting split (#425). Resolved every conflict to the branch side: main's contribution to those files is formatting only, already present here, so the merge tree is byte-identical to the pre-merge branch tip. Drops ~3750 lines of reindentation from this branch's diff.

## Risk tier - [x] T3: physics, output behavior, coordinate convention ## Correctness contract ### Intended behavior change Adds a new orbit model: `orbit_model = 7` (`ORBIT_FULL_ORBIT`) selects a gyro-resolved full-orbit (FO) Boris pusher, the ASCOT-style counterpart to the guiding-center (GC) model. A charged particle advances by explicit Boris drift-rotate-drift in Cartesian (x, v); field and geometry come from the production Boozer field through the chartmap (B = curl A, tangent to the flux surface), with the magnetic axis healed by a pseudo-Cartesian (X,Y,phi) chart. Output `z(1:5)` is the guiding-centre reduction each step, so `times_lost` and `confined_fraction` are written exactly as for GC. ### Behavior that must not change `orbit_model = 0` (guiding center, the default) is untouched: the symplectic GC path, its init, and its macrostep are unchanged. No existing test output moves. ### Coordinate / unit conventions Cartesian (x, v) in the scaled chartmap frame; logical chart u=(rho, theta_B, phi_B) with rho = sqrt(s). CGS-style normalized velocity with the GC sqrt(2) convention, matching the GC seed so both integrators start from the identical particle. `orbit_coord = 1` (Boozer) is required and enforced. ### Numerical invariants Energy is conserved to machine precision (the Boris rotation is exact for constant B over a step; no electric field here). The only confinement loss is the guiding-centre crossing s >= 1; a field-inversion non-convergence is a numerical fault (`fo_fault`), reported but never counted as a loss. ## Tests added - unit: - integration: `test_fo_boris` (energy bound < 1e-3, near-axis crossing, no spurious loss; passing/trapped/near-axis on the reactor-scale Boozer field) - system: - golden record: ## Golden-record impact - [x] unchanged GC is the default and is untouched; FO is opt-in via `orbit_model = 7`. ## Failure modes considered - Field-inversion non-convergence near a field-period seam: classified as a numerical fault (`fo_fault`), never a loss, so a mid-radius particle is not spuriously lost. - Near-axis singularity of the polar chart: healed by the pseudo-Cartesian (X,Y,phi) chart; the near-axis test orbit crosses s in [0.04, 0.07] with energy bounded. - Unsupported combinations (wall loss, collisions, orbit classification, or orbit_coord /= 1) stop at config validation with a clear message instead of running silently wrong. ## Manual validation Reactor-scale W7-X benchmark (separate run): FO confined fraction agrees with ASCOT5 full orbit (0.894 vs 0.898 at 1 ms, 1000 alphas), and with GC within marker statistics. ## Verification Before: there is no full-orbit model on `main`; `orbit_model` and the FO modules do not exist. ``` $ git grep -l 'ORBIT_FULL_ORBIT\|orbit_fo_boris' origin/main -- src test (no matches) ``` After: `test_fo_boris` passes and the fast suite is green. ``` $ make test TEST=fo_boris 1/1 Test #6: test_fo_boris .................... Passed 6.01 sec passing s band [0.5000,0.7700] max|dE/E0|=2.22E-14 ierr_lost=0 trapped s band [0.4999,0.5894] max|dE/E0|=1.11E-14 ierr_lost=0 drift=8.85E-03 near-axis s band [0.0400,0.0708] max|dE/E0|=8.88E-15 ierr_lost=0 ALL FO-BORIS TESTS PASSED $ make test-fast 100% tests passed, 0 tests failed out of 53 ``` ## Relationship to #408 This is the working full-orbit Boris slice, extracted clean off `main`. The experimental models on #408 (CPP Pauli, the adaptive RK45 ORBIT_CP6D_RK, the 6D canonical CP/CPP, device/mock variants) stay on that branch; #408 rebases on top once this merges.

Brings in the merged full-orbit (FO) Boris pusher (orbit_model=7, orbit_fo_boris/orbit_fo_field, test_fo_boris). The four files both branches touch (params, diag_counters, simple, simple_main) kept this branch's experimental CP/CPP/RK code; the FO modules build alongside under distinct names and test_fo_boris passes. ORBIT_FULL_ORBIT vs this branch's ORBIT_CP6D_BORIS both claim model 7, so the field/dispatch dedup (adopt FO as the canonical model 7, renumber the experimental models) is a follow-up.

krystophny · 2026-06-23T18:58:20Z

Closing as retired. The curvilinear/canonical full-orbit-port investigation is superseded: the working full orbit shipped as FO (orbit_model=7, src/orbit_fo_boris.f90) via #426, and the curvilinear CP/CPP path is retired per #420 (move to Cartesian); the canonical-CPP bounce-resonance issue is tracked in #417. Branch preserved as z_archive/.

krystophny added 6 commits June 12, 2026 21:54

Merge remote-tracking branch 'origin/refactor/shared-symplectic-core'…

b72d625

… into feature/simple-6d-port

krystophny added tier/T3 physics or output behavior size/L review size up to 1000 changed lines labels Jun 19, 2026

krystophny temporarily deployed to github-pages June 19, 2026 17:21 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 19, 2026 17:50 — with GitHub Actions Inactive

krystophny changed the title ~~Add 6D canonical-midpoint integrator (cp/cpp_sym/cpp_var) on analytic tokamak~~ Generalize 6D canonical integrator to arbitrary curvilinear coordinates (VMEC) Jun 19, 2026

krystophny temporarily deployed to github-pages June 19, 2026 19:28 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 19, 2026 20:52 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 19, 2026 22:35 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 19, 2026 23:58 — with GitHub Actions Inactive

krystophny added 2 commits June 20, 2026 06:20

krystophny temporarily deployed to github-pages June 20, 2026 04:57 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 21, 2026 12:38 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 21, 2026 13:00 — with GitHub Actions Inactive

krystophny added 6 commits June 21, 2026 16:46

krystophny temporarily deployed to github-pages June 21, 2026 20:16 — with GitHub Actions Inactive

krystophny temporarily deployed to github-pages June 21, 2026 20:29 — with GitHub Actions Inactive

krystophny added 5 commits June 21, 2026 22:39

krystophny added 2 commits June 22, 2026 11:20

krystophny mentioned this pull request Jun 22, 2026

Reformat 31 files with fprettify (no semantic change) #425

Merged

krystophny added 2 commits June 22, 2026 16:26

krystophny mentioned this pull request Jun 22, 2026

Add full-orbit (FO) Boris pusher (orbit_model=7) #426

Merged

2 tasks

krystophny added 2 commits June 22, 2026 23:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Generalize 6D canonical integrator to arbitrary curvilinear coordinates (VMEC)#408

Generalize 6D canonical integrator to arbitrary curvilinear coordinates (VMEC)#408
krystophny wants to merge 61 commits into
refactor/shared-symplectic-corefrom
feature/simple-6d-port

krystophny commented Jun 19, 2026 •

edited

Loading

Uh oh!

krystophny commented Jun 22, 2026

Uh oh!

krystophny commented Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

krystophny commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Risk tier

Correctness contract

Intended behavior change

Behavior that must not change

Coordinate / unit conventions

Numerical invariants

Tests added

Golden-record impact

Failure modes considered

Manual validation

Verification

Update: arbitrary curvilinear coordinates on real VMEC

Honest limitations

Verification (curvilinear / VMEC)

Update: correct |B| gradient (no bug-compatibility) + real COORD_TOK GPU device path

FIX A: correct d|B| and analytic Hessian

FIX B: real COORD_TOK GPU device path

No regression

Comment fix

Verification: production CPP wiring (ORBIT_CPP6D)

What changed

No GC regression (gfortran Fast)

New test (energy/mu gate + loss propagation on the production wire)

Honest scope

Update: route the production CPP6D loss path to COORD_VMEC

Failing-before (COORD_CHARTMAP, real W7-X chartmap)

Passing-after (COORD_VMEC, test/test_data/wout.nc)

End-to-end loss run (simple.x, BOOZER on wout.nc, 32 particles, trace_time=1e-3 s)

No GC regression (gfortran Fast)

Notes

Verification: production CP wiring (ORBIT_CP6D)

npoiper2 determined by energy conservation

CP gyro-center tracks the GC; energy and mu conserved

Field-eval counter and version string

No GC regression (gfortran Fast)

Uh oh!

krystophny commented Jun 22, 2026

Status update: branch pivoted to Cartesian-chartmap CP/CPP, now matches ASCOT5

Uh oh!

krystophny commented Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

krystophny commented Jun 19, 2026 •

edited

Loading

Passing-after (COORD_VMEC, `test/test_data/wout.nc`)

End-to-end loss run (`simple.x`, BOOZER on wout.nc, 32 particles, trace_time=1e-3 s)